7 research outputs found

    NEMO: Extraction and normalization of organization names from PubMed affiliations

    Background: We are witnessing an exponential increase in biomedical research citations in PubMed. However, translating biomedical discoveries into practical treatments is estimated to take around 17 years, according to the 2000 Yearbook of Medical Informatics, and much information is lost during this transition. Pharmaceutical companies spend huge sums to identify opinion leaders and centers of excellence. Conventional methods such as literature search, survey, observation, self‐identification, expert opinion, and sociometry not only need much human effort, but are also non‐comprehensive. Such huge delays and costs can be reduced by “connecting those who produce the knowledge with those who apply it”. A humble step in this direction is large‐scale discovery of persons and organizations involved in specific areas of research. This can be achieved by automatically extracting and disambiguating author names and affiliation strings retrieved through Medical Subject Heading (MeSH) terms and other keywords associated with articles in PubMed. In this study, we propose NEMO (Normalization Engine for Matching Organizations), a system for extracting organization names from the affiliation strings provided in PubMed abstracts, building a thesaurus (list of synonyms) of organization names, and subsequently normalizing them to a canonical organization name using the thesaurus. Results: We used a parsing process that involves multi‐layered rule matching with multiple dictionaries. The normalization process involves clustering based on weighted local sequence alignment metrics to address synonymy at word level, and local learning based on finding connected components to address synonymy. The graphical user interface and java client library of NEMO are available at http://lnxnemo.sourceforge.net. Conclusion: NEMO associates each biomedical paper and its authors with a unique organization name and the geopolitical location of that organization. 
This system provides more accurate information about organizations than the raw affiliation strings provided in PubMed abstracts. It can be used for: a) bimodal social network analysis that evaluates the research relationships between individual researchers and their institutions; b) improving author name disambiguation; c) augmenting the National Library of Medicine (NLM)’s Medical Articles Record System (MARS) to correct errors due to OCR on affiliation strings that are in small fonts; and d) improving PubMed citation indexing strategies (authority control) based on normalized organization name and country.
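The connected-components step mentioned in the abstract can be illustrated with a minimal sketch. This is not NEMO's implementation: the token-overlap (Jaccard) similarity below is a simple stand-in for the weighted local sequence alignment metric the abstract names, and the function names, the threshold of 0.5, and the sample organization strings are all assumptions made for illustration.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set overlap; a stand-in for NEMO's alignment-based similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def cluster_names(names, threshold=0.5):
    """Group names into connected components of the 'similar enough' graph."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair whose similarity reaches the (hypothetical) threshold.
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if jaccard(names[i], names[j]) >= threshold:
                parent[find(i)] = find(j)

    # Collect the members of each component.
    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())


names = [
    "Arizona State University",
    "Arizona State Univ",
    "Arizona State Univ Tempe",
    "Mayo Clinic",
]
clusters = cluster_names(names)
```

Here the first and third strings fall below the threshold when compared directly, yet they land in the same cluster because both are similar to the second. That transitive grouping is what connected components add over pairwise matching.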

    Massive-scale Decoding for Text Generation using Lattices

    Conditional neural text generation models generate high-quality outputs, but often concentrate around a mode when what we really want is a diverse set of options. We present a search algorithm to construct lattices encoding a massive number of generation options. First, we restructure decoding as a best-first search, which explores the space differently than beam search and improves efficiency by avoiding pruning paths. Second, we revisit the idea of hypothesis recombination: we can identify pairs of similar generation candidates during search and merge them as an approximation. On both summarization and machine translation, we show that our algorithm encodes thousands of diverse options that remain grammatical and high-quality into one lattice. This algorithm provides a foundation for building downstream generation applications on top of massive-scale diverse outputs. Comment: NAACL 2022, see https://github.com/jiacheng-xu/lattice-generation for code
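The two ideas in this abstract, best-first expansion and hypothesis recombination, can be sketched on a toy problem. This is not the authors' algorithm or code: the `TOY_MODEL` table of next-token log-probabilities is hypothetical, and treating two hypotheses as mergeable when they share a position and last token is an assumed, deliberately crude merge criterion.

```python
import heapq
import math

# Hypothetical toy "model": next-token log-probabilities given the last token.
TOY_MODEL = {
    "<s>": {"a": math.log(0.6), "the": math.log(0.4)},
    "a": {"cat": math.log(0.5), "dog": math.log(0.5)},
    "the": {"cat": math.log(0.5), "dog": math.log(0.5)},
    "cat": {"</s>": 0.0},
    "dog": {"</s>": 0.0},
}


def build_lattice(model, max_expansions=100):
    """Best-first search that merges hypotheses reaching the same node."""
    start = (0, "<s>")          # node = (position, last token)
    frontier = [(0.0, start)]   # min-heap of (negative log-prob, node)
    edges = {}                  # node -> set of successor nodes
    expanded = set()
    steps = 0
    while frontier and steps < max_expansions:
        neg_score, node = heapq.heappop(frontier)
        if node in expanded:
            continue  # an equivalent hypothesis was already expanded
        expanded.add(node)
        steps += 1
        pos, token = node
        for nxt, logp in model.get(token, {}).items():
            child = (pos + 1, nxt)
            # Recombination: a child reached twice becomes one shared node.
            edges.setdefault(node, set()).add(child)
            if child not in expanded:
                heapq.heappush(frontier, (neg_score - logp, child))
    return start, edges


def count_paths(node, edges):
    """Number of complete sequences the lattice encodes from `node`."""
    if node[1] == "</s>":
        return 1
    return sum(count_paths(child, edges) for child in edges.get(node, ()))


start, edges = build_lattice(TOY_MODEL)
num_paths = count_paths(start, edges)
```

In this toy run, "a cat" and "the cat" meet at one shared node, so six lattice nodes encode four complete sequences. The same sharing, at a much larger scale, is what lets a lattice hold many more options than an explicit list of hypotheses.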

    Automatically extracting sentences from Medline citations to support clinicians' information needs

    Online health knowledge resources contain answers to most of the information needs raised by clinicians in the course of care. However, significant barriers limit the use of these resources for decision-making, especially clinicians’ lack of time. In this study we assessed the feasibility of automatically generating knowledge summaries for a particular clinical topic composed of relevant sentences extracted from Medline citations.

    Audio Recording Patient-Nurse Verbal Communications in Home Health Care Settings: Pilot Feasibility and Usability Study

    Background: Patients’ spontaneous speech can act as a biomarker for identifying pathological entities, such as mental illness. Despite this potential, audio recording patients’ spontaneous speech is not part of clinical workflows, and health care organizations often do not have dedicated policies regarding the audio recording of clinical encounters. No previous studies have investigated the best practical approach for integrating audio recording of patient-clinician encounters into clinical workflows, particularly in the home health care (HHC) setting. Objective: This study aimed to evaluate the functionality and usability of several audio-recording devices for the audio recording of patient-nurse verbal communications in HHC settings and elicit HHC stakeholder (patients and nurses) perspectives about the facilitators of and barriers to integrating audio recordings into clinical workflows. Methods: This study was conducted at a large urban HHC agency located in New York, United States. We evaluated the usability and functionality of 7 audio-recording devices in a laboratory (controlled) setting. A total of 3 devices (Saramonic Blink500, Sony ICD-TX6, and Black Vox 365) were further evaluated in a clinical setting (patients’ homes) by HHC nurses who completed the System Usability Scale questionnaire and participated in a short, structured interview to elicit feedback about each device. We also evaluated the accuracy of the automatic transcription of audio-recorded encounters for the 3 devices using the Amazon Web Service Transcribe. Word error rate was used to measure the accuracy of automated speech transcription. To understand the facilitators of and barriers to integrating audio recording of encounters into clinical workflows, we conducted semistructured interviews with 3 HHC nurses and 10 HHC patients. Thematic analysis was used to analyze the transcribed interviews. Results: Saramonic Blink500 received the best overall evaluation score. 
The System Usability Scale score and word error rate for Saramonic Blink500 were 65% and 26%, respectively, and nurses found it easier to approach patients using this device than with the other 2 devices. Overall, patients found the process of audio recording to be satisfactory and convenient, with minimal impact on their communication with nurses. Although, in general, nurses also found the process easy to learn and satisfactory, they suggested that the audio recording of HHC encounters can affect their communication patterns. In addition, nurses were not aware of the potential to use audio-recorded encounters to improve health care services. Nurses also indicated that they would need to involve their managers to determine how audio recordings could be integrated into their clinical workflows and for any ongoing use of audio recordings during patient care management. Conclusions: This study established the feasibility of audio recording HHC patient-nurse encounters. Training HHC nurses about the importance of the audio-recording process and the support of clinical managers are essential factors for successful implementation.
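For readers unfamiliar with the metric, word error rate is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below shows that standard definition. The abstract does not say how the study tokenized or normalized text, so plain whitespace splitting and the sample sentences here are assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("feels" -> "feel") and one deletion ("today")
# against a five-word reference.
wer = word_error_rate("the patient feels well today", "the patient feel well")
```

On this definition, a word error rate of 26% corresponds to roughly one word-level error for every four reference words.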